Speculation

Is Ideal for...
- JavaScript
- Java
- Smalltalk
- Python
- Ruby
- Scheme

...many dynamic languages...
Intuition


Leverage traditional compiler technology 
to make dynamic languages as fast as possible. 


Traditional Compiler

int foo(int a, int b)
{
    return a + b;
}

function foo(a, b)
{
    return a + b;
}

Optimized JS function

function foo(a, b)
{
    speculate(isInt32(a));
    speculate(isInt32(b));
    return a + b;
}
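
The speculated function can be imitated in plain JavaScript. This is only a sketch: `speculate`, `SpeculationFailure`, `fooGeneric`, and `fooOptimized` are hypothetical names, and a thrown exception stands in for what a real engine does with an OSR exit.

```javascript
// Hypothetical stand-ins; none of these names are JSC APIs.
class SpeculationFailure extends Error {}

function isInt32(v) {
    return typeof v === "number" && (v | 0) === v && !Object.is(v, -0);
}

function speculate(condition) {
    if (!condition)
        throw new SpeculationFailure();
}

// Generic path: handles every JavaScript value.
function fooGeneric(a, b) {
    return a + b;
}

// "Optimized" path: only valid while both speculations hold.
function fooOptimized(a, b) {
    speculate(isInt32(a));
    speculate(isInt32(b));
    return a + b;
}

function foo(a, b) {
    try {
        return fooOptimized(a, b);
    } catch (e) {
        if (!(e instanceof SpeculationFailure))
            throw e;
        return fooGeneric(a, b); // stands in for an OSR exit
    }
}
```

Calling `foo(1, 2)` stays on the speculated path, while `foo("a", 1)` fails the first check and falls back to the generic path.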
Speculation with Control Flow Diamond


Branch(isInt32(value)) 


Call(slow path) 
Speculation with OSR

Traditional Compiler + Enhancements

Speculation Has A
Function Granularity Bias

- Compiler sees single-entrypoint function + inlines.
- Speculations exit the function and rarely reenter.
JSC's Four Tiers

[Figure: tier diagram; surviving labels: latency, throughput, concurrency, LLInt]

"use strict";

let result = 0;
for (let i = 0; i < 10000000; ++i) {
    let o = {f: i};
    result += o.f;
}

print(result);

0.12ms 0.48ms 2.80ms
Execution Time = (3.97 ns) × (Bytecodes in LLInt)
               + (1.71 ns) × (Bytecodes in Baseline)
               + (0.349 ns) × (Bytecodes in DFG)
               + (0.225 ns) × (Bytecodes in FTL)
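
The formula above is a weighted sum, which the following sketch evaluates directly. The per-bytecode costs come from the slide; the function name and the shape of the `counts` argument are assumptions made for illustration.

```javascript
// Per-bytecode costs in nanoseconds, taken from the formula above.
const COST_NS = { llint: 3.97, baseline: 1.71, dfg: 0.349, ftl: 0.225 };

// counts: bytecodes executed in each tier (hypothetical input shape).
function estimateExecutionTimeNs(counts) {
    let total = 0;
    for (const tier of Object.keys(COST_NS))
        total += COST_NS[tier] * (counts[tier] || 0);
    return total;
}

// Example: one million bytecodes run entirely in one tier.
const allLLInt = estimateExecutionTimeNs({ llint: 1e6 });
const allFTL = estimateExecutionTimeNs({ ftl: 1e6 });
console.log(allLLInt / allFTL); // ratio of interpreter cost to FTL cost
```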
Common IR

- Frame of reference for profiling
- Frame of reference for OSR

Bytecode

JSC Bytecode

- Register-based
- Compact
- Untyped
- High-level
- Directly interpretable
- Transformable

Register-based

add result, left, right

result = left + right
- Execution Counting
- Exit Counting
- Recompilation

Execution Counting
Another Example:
_handlePropertyAccessExpression#D5n0Sd

function (result, node)
{
    result.possibleGetOverloads = node.possibleGetOverloads;
    result.possibleSetOverloads = node.possibleSetOverloads;
    result.possibleAndOverloads = node.possibleAndOverloads;
    result.baseType = Node.visit(node.baseType, this);
    result.callForGet = Node.visit(node.callForGet, this);
    result.resultTypeForGet = Node.visit(node.resultTypeForGet, this);
    result.callForAnd = Node.visit(node.callForAnd, this);
    result.resultTypeForAnd = Node.visit(node.resultTypeForAnd, this);
    result.callForSet = Node.visit(node.callForSet, this);
    result.errorForSet = node.errorForSet;
    result.updateCalls();
}
Control

- Execution Counting
- Exit Counting
- Recompilation

Bytecode
Profiling Tier

- Non-speculative execution engine(s)
- Profiling

Profiling

[Figure: tier diagram; surviving labels: latency, throughput]

Low Level Interpreter

macro llintJumpTrueOrFalseOp(name, op, conditionOp)
    llintOpWithJump(op_%name%, op, macro (size, get, jump, dispatch)
        get(condition, t1)
        loadConstantOrVariable(size, t1, t0)
        btqnz t0, ~0xf, .slow
        conditionOp(t0, .target)
        dispatch()

    .target:
        jump(target)

    .slow:
        callSlowPath(_llint_slow_path_%name%)
        nextInstruction()
    end)
end


Baseline JIT

[   7] add loc6, arg1, arg2
    0x2f8084601a65: mov 0x30(%rbp), %rsi
    0x2f8084601a69: mov 0x38(%rbp), %rdx
    0x2f8084601a6d: cmp %r14, %rsi
    0x2f8084601a70: jb 0x2f8084601af2
    0x2f8084601a76: cmp %r14, %rdx
    0x2f8084601a79: jb 0x2f8084601af2
    0x2f8084601a7f: mov %esi, %eax
    0x2f8084601a81: add %edx, %eax
    0x2f8084601a83: jo 0x2f8084601af2
    0x2f8084601a89: or %r14, %rax
    0x2f8084601a8c: mov %rax, -0x38(%rbp)


Profiling Goals

- Cheap
- Useful

Useful Profiling

- Speculation is a bet.
- Profiling makes it a value bet.


Winning In the Average

Expected Value of Bet = p × B - (1 - p) × C

Variable  Meaning
p         Probability of winning
B         Benefit of winning (positive)
C         Cost of losing (positive)

Good bet iff p × B - (1 - p) × C > 0

Winning at Speculation

Good speculation iff p × B - (1 - p) × C > 0
Execution Time = (3.97 ns) × (Bytecodes in LLInt)
               + (1.71 ns) × (Bytecodes in Baseline)
               + (0.349 ns) × (Bytecodes in DFG)
               + (0.225 ns) × (Bytecodes in FTL)

B < 1.71 - 0.225 = 1.48 ns

Winning at Speculation

Good speculation iff p × B - (1 - p) × C > 0

Variable  Meaning
p         Probability of winning
B         Time saved by speculation
C         Time lost by speculation failure

C = (OSR exit cost) + (Bytecodes before reentry) × B
  + (Recompile cost) / (Exits to recompile)

C: DFG ≈ 2499 ns, FTL ≈ 9998 ns

Good speculation iff p > 0.9994

Winning at Speculation

- Only speculate if we believe that we will win every time.
- Profiling should record counterexamples to useful speculations.
- Profiling should run for a long time.
- Don't stress when speculation fails, unless it fails in the average.


Profiling Sources in JSC

- Case Flags: branch speculation
- Case Counts: branch speculation
- Value Profiling: type inference of values
- Inline Caches: type inference of object structure
- Watchpoints: heap speculation
- Exit Flags: speculation backoff


Case Flags

Case flag = tells if a counterexample to a speculation ever happened.

class StructureStubInfo {

    ALWAYS_INLINE bool considerCaching(
        CodeBlock* codeBlock, Structure* structure)
    {
        // We never cache non-cells.
        if (!structure) {
            sawNonCell = true;
            return false;
        }

void ArithProfile::emitSetDouble(CCallHelpers& jit) const
{
    if (shouldEmitSetDouble())
        jit.or32(
            CCallHelpers::TrustedImm32(
                ArithProfile::Int32Overflow |
                ArithProfile::Int52Overflow |
                ArithProfile::NegZeroDouble |
                ArithProfile::NonNegZeroDouble),
            CCallHelpers::AbsoluteAddress(addressOfBits()));
}


Why infer int32?

template<typename T, typename U>
void multiply(Mat<T>& result,
              const Mat<T>& left,
              const Mat<T>& right)
{
    for (U resultColumn = result.numColumns(); resultColumn--;) {
        for (U resultRow = result.numRows(); resultRow--;) {
            T& resultCell = result.at(resultRow, resultColumn);
            resultCell = T();
            for (U i = left.numColumns(); i--;) {
                resultCell +=
                    left.at(resultRow, i) *
                    right.at(i, resultColumn);
            }
        }
    }
}

[Chart: 10x10 matrix multiply with different element types (int, double, CheckedInt); y-axis: nanoseconds to multiply 10x10 matrix]

[Chart: 10x10 int matrix multiply with different index types (int, double, CheckedInt); y-axis: nanoseconds to multiply 10x10 matrix]

Use Int32 whenever possible to avoid future double → int conversions


Case Flags Example: Add

int32_t left = ...;
int32_t right = ...;
ArithProfile* profile = ...;
int32_t intResult;
JSValue result;
if (UNLIKELY(addOverflowed(
        left, right,
        &intResult))) {
    result = jsNumber(
        double(left) +
        double(right));
    profile->setObservedInt32Overflow();
} else
    result = jsNumber(intResult);

// if (!profile->didObserveInt32Overflow())

int32_t left = ...;
int32_t right = ...;
int32_t result;
speculate(!addOverflowed(
    left, right,
    &result));

// if (profile->didObserveInt32Overflow())

double left = ...;
double right = ...;
double result = left + right;
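
The same producer/consumer split can be sketched in JavaScript. The profile object and both function names here are hypothetical; the point is only that the profiling tier sets a sticky flag on the slow path and the optimizing tier reads it when choosing a specialization.

```javascript
// Hypothetical profile object; JSC's ArithProfile packs such flags into bits.
function makeArithProfile() {
    return { observedInt32Overflow: false };
}

const INT32_MIN = -(2 ** 31), INT32_MAX = 2 ** 31 - 1;

// Profiling-tier add: always correct, records a counterexample on overflow.
function profiledAdd(left, right, profile) {
    const sum = left + right; // exact for int32 inputs: doubles hold 53-bit integers
    if (sum < INT32_MIN || sum > INT32_MAX)
        profile.observedInt32Overflow = true;
    return sum;
}

// Optimizing-tier decision: which specialization to emit for this add.
function chooseAddSpecialization(profile) {
    return profile.observedInt32Overflow
        ? "double add"
        : "int32 add with overflow check";
}
```

The flag is never cleared, so one observed overflow is enough to steer later compilations toward the double version.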


Case Counts

RareCaseProfile* rareCaseProfile = 0;
if (shouldEmitProfiling()) {
    rareCaseProfile =
        m_codeBlock->addRareCaseProfile(m_bytecodeOffset);
}

if (shouldEmitProfiling()) {
    add32(
        TrustedImm32(1),
        AbsoluteAddress(&rareCaseProfile->m_counter));
}

Rare Case Count
Thresholds

Count Threshold

- Assume this is exotic.
- Assume integer math overflowed to double.
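
A case count differs from a case flag in that the compiler compares it to a threshold instead of treating a single occurrence as decisive. The sketch below assumes a threshold of 100 purely for illustration, since the slide's actual values did not survive extraction; the class and method names are hypothetical too.

```javascript
// Hypothetical threshold; the slide's actual values did not survive extraction.
const ASSUMED_RARE_CASE_THRESHOLD = 100;

class RareCaseCounter {
    constructor() {
        this.counter = 0;
    }

    // Called from the slow path each time the rare case executes.
    record() {
        this.counter++;
    }

    // The optimizing compiler treats the case as likely once the count is high enough.
    tookRareCaseOften(threshold = ASSUMED_RARE_CASE_THRESHOLD) {
        return this.counter >= threshold;
    }
}
```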




Value Profiling

macro valueProfile(op, metadata, value)
    storeq value, %op%::Metadata::profile.m_buckets[metadata]
end

- Use static analysis whenever possible.
- Value profiling fills in the blanks:
    - loads
    - calls
    - etc

Value Profiling

- Combined with prediction propagation
- Provides predicted type inference
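
One way to picture how a stored value becomes a predicted type: the profiling tier only writes the last value into a bucket, and a later pass folds the bucket's type into an accumulated set. The sketch below assumes that two-step design; the `Type` bits and the `ValueProfile` class are hypothetical and much coarser than a real engine's type lattice.

```javascript
// Hypothetical type bits, loosely modeled on the idea of a speculated-type set.
const Type = { None: 0, Int32: 1, Double: 2, String: 4, Object: 8, Other: 16 };

function typeOf(value) {
    if (typeof value === "number") {
        const isInt32 = (value | 0) === value && !Object.is(value, -0);
        return isInt32 ? Type.Int32 : Type.Double;
    }
    if (typeof value === "string") return Type.String;
    if (typeof value === "object" && value !== null) return Type.Object;
    return Type.Other;
}

class ValueProfile {
    constructor() {
        this.bucket = undefined;
        this.hasSample = false;
        this.prediction = Type.None;
    }

    // Cheap: the profiling tier only stores the last value seen.
    record(value) {
        this.bucket = value;
        this.hasSample = true;
    }

    // Occasionally fold the sampled value into the accumulated prediction.
    updatePrediction() {
        if (this.hasSample) {
            this.prediction |= typeOf(this.bucket);
            this.hasSample = false;
        }
        return this.prediction;
    }
}
```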
var o = {f: 5, g: 6};
var v = o.f;

"Inline Cache"

if (o->structureID == 42)
    v = o->inlineStorage[0]
else
    v = slowGet(o, "f")

Property Table
f: inline(0)
g: inline(1)


Interpreter Inline Cache

get_by_id <result>, <base>, <propertyName>,
          <cachedStructureID>, <cachedOffset>

get_by_id loc42, loc43, "g",
          0, 0

get_by_id loc42, loc43, "g",
          42, 1
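
The transition from `0, 0` to `42, 1` can be imitated with a toy interpreter. Everything in this sketch is hypothetical (the object model, the `structures` table, the function names); it only shows an instruction filling in its own cache operands on the slow path and taking the fast path afterwards.

```javascript
// Toy object model: every object has a structure ID and a storage array.
// Hypothetical names throughout; this is not JSC's object layout.
const structures = new Map(); // structureID -> Map(propertyName -> offset)
structures.set(42, new Map([["f", 0], ["g", 1]]));

function makeObject(structureID, values) {
    return { structureID, storage: values };
}

// One get_by_id instruction with its two cache operands (0 means "empty").
function makeGetById(propertyName) {
    return { propertyName, cachedStructureID: 0, cachedOffset: 0, slowPathCount: 0 };
}

function executeGetById(instruction, base) {
    // Fast path: the structure check succeeded.
    if (base.structureID === instruction.cachedStructureID)
        return base.storage[instruction.cachedOffset];

    // Slow path: full lookup, then fill the cache.
    instruction.slowPathCount++;
    const offset = structures.get(base.structureID).get(instruction.propertyName);
    if (offset === undefined)
        return undefined;
    instruction.cachedStructureID = base.structureID;
    instruction.cachedOffset = offset;
    return base.storage[offset];
}
```

After one execution against `{f: 5, g: 6}`, the instruction for `"g"` holds structure ID 42 and offset 1, matching the last form shown above.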
JIT Inline Cache

0x46f8c30b9b0: mov 0x30(%rbp), %rax
0x46f8c30b9b4: test %rax, %r15
0x46f8c30b9b7: jnz 0x46f8c30ba2c
0x46f8c30b9bd: jmp 0x46f8c30ba2c
0x46f8c30b9c2: o16 nop %cs:0x200(%rax,%rax)
0x46f8c30b9d1: nop (%rax)
0x46f8c30b9d4: mov %rax, -0x38(%rbp)

JIT Inline Cache

0x46f8c30b9b0: mov 0x30(%rbp), %rax
0x46f8c30b9b4: test %rax, %r15
0x46f8c30b9b7: jnz 0x46f8c30ba2c
0x46f8c30b9bd: cmp $0x125, (%rax)
0x46f8c30b9c3: jnz 0x46f8c30ba2c
0x46f8c30b9c9: mov 0x18(%rax), %rax
               nop 0x200(%rax)


Inline caches implicitly collect profiling information. 


tmp = o.f
tmp2 = o.g

[Animation, initial state]

get_by_id              jmp Lslow
get_by_id              jmp Lslow

[Animation, after running with structure S1]

get_by_id ..., S1, 0   cmp S1, (%rax)
                       jnz Lslow
                       mov 10(%rax), %rax

get_by_id ..., S1, 1   cmp S1, (%rax)
                       jnz Lslow
                       mov 18(%rax), %rax

Inline Cache Control Flow

[Diagram: control-flow diamonds with slow paths; the repeated structure check is labeled "not redundant!"]

Inline Cache Control Flow            Inlined with OSR exits

Branch(Equal(                        Branch(Equal(
    @structure, $S1))                    @structure, $S1))
Load(@object,                        Load(@object,
    offset = 0x10)                       offset = 0x10)
... things ...                       ... things ...
Branch(Equal(                        Branch(Equal(
    @structure, $S1))                    @structure, $S1))
Load(@object,                        Load(@object,
    offset = 0x18)                       offset = 0x18)
... more things ...                  ... more things ...


Minimorphic IC Inlining

var tmp = o.f;

CheckStructure(@o, [S1, S2])
GetByOffset(@o, "f", 0)

Polymorphic IC Inlining

var tmp = o.f;

DFG IR: MultiGetByOffset(@o, "f", [S1, S2] => 0, [S3] => 1)

B3 IR:

if (o->structureID == S1 || o->structureID == S2)
    result = o->inlineStorage[0]
else
    result = o->inlineStorage[1]
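
A JavaScript sketch of what a `MultiGetByOffset` over `[S1, S2] => 0, [S3] => 1` computes. The `cases` data layout, the `OSRExit` class, and the numeric structure IDs are assumptions for illustration. Unlike the two-way `if`/`else` on the slide, this version makes the "none of the profiled structures matched" case explicit by throwing.

```javascript
// Stands in for leaving optimized code; a real engine performs an OSR exit instead.
class OSRExit extends Error {}

// cases: list of { structures: [...], offset }, like [S1, S2] => 0, [S3] => 1.
function multiGetByOffset(o, cases) {
    for (const { structures, offset } of cases) {
        if (structures.includes(o.structureID))
            return o.inlineStorage[offset];
    }
    // No profiled structure matched.
    throw new OSRExit("unexpected structure " + o.structureID);
}

const S1 = 1, S2 = 2, S3 = 3; // hypothetical structure IDs
const fCases = [
    { structures: [S1, S2], offset: 0 },
    { structures: [S3], offset: 1 },
];
```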


function foo(o) { return o.f; }

function bar(p) { return foo(p.g); }

[Animation: the inline cache for o.f starts as "jmp Lslow" and is later patched to the sequence below]

cmp S1, (%rax)
jnz Lslow
mov 10(%rax), %rax


Inline Caches

- Great optimization
- Implicitly provides profiling data
- Polyvariant




Watchpoints

Math.pow(42, 2)

resolve_scope
get_from_scope
get_by_id
call

Watchpoints

powfunc(42, 2)

const(powfunc)
call

Watchpoints

Math = "wat";

Math.pow(42, 2)

resolve_scope
get_from_scope
get_by_id
call
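
The mechanism can be sketched as an invalidation callback. All names below (`Watchpoint`, `setGlobal`, `squareOptimized`, and so on) are hypothetical, and swapping a function reference stands in for what a real engine does when it throws away compiled code.

```javascript
class Watchpoint {
    constructor() {
        this.valid = true;
        this.callbacks = [];
    }

    // Optimized code registers what to do if the assumption breaks.
    watch(callback) {
        this.callbacks.push(callback);
    }

    // Fired by whoever changes the watched state.
    fire() {
        if (!this.valid) return;
        this.valid = false;
        for (const callback of this.callbacks) callback();
    }
}

// Toy global scope in which writes to "Math" fire a watchpoint.
const mathWatchpoint = new Watchpoint();
const globals = { Math };
function setGlobal(name, value) {
    if (name === "Math") mathWatchpoint.fire();
    globals[name] = value;
}

// Generic code: resolve, load, load, call.
function squareGeneric() { return globals.Math.pow(42, 2); }

// "Compiled" code: the pow function is a constant, guarded by the watchpoint.
const powfunc = Math.pow;
function squareOptimized() { return powfunc(42, 2); }

let square = squareOptimized;
mathWatchpoint.watch(() => { square = squareGeneric; }); // fall back on invalidation
```

The optimized version pays nothing at run time for the assumption; the cost is moved to the rare write that invalidates it.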


Watchpoints Example #2

Strength.REQUIRED = new Strength(0, "required");
Strength.STONG_PREFERRED = new Strength(1, "strongPreferred");
Strength.PREFERRED = new Strength(2, "preferred");
Strength.STRONG_DEFAULT = new Strength(3, "strongDefault");
Strength.NORMAL = new Strength(4, "normal");
Strength.WEAK_DEFAULT = new Strength(5, "weakDefault");
Strength.WEAKEST = new Strength(6, "weakest");

Source: deltablue benchmark

Watchpoints Example #3

AST.prototype.typeCheck = function (typeFlow) {
    switch (this.nodeType) {
        case TypeScript.NodeType.Error:
        case TypeScript.NodeType.EmptyExpr: {
            this.type = typeFlow.anyType;
            break;
        }

Source: typescript compiler


Watchpoints

- Object Property Conditions (equality, presence, absence, etc)
    - relies on structures and ICs
- Lots of exotic watchpoints




Exit Flags

bool Graph::canOptimizeStringObjectAccess(
    const CodeOrigin& codeOrigin)
{
    if (hasExitSite(
            codeOrigin,
            NotStringObject))
        return false;

void LowerDFGToB3::speculateStringObjectForStructureID(
    Edge edge, LValue structureID)
{
    speculate(
        NotStringObject,
        noValue(), 0,
        m_out.notEqual(...));
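
The backoff loop formed by these two excerpts can be sketched as follows. `ExitProfile` and its methods are hypothetical names; the idea is that a failed speculation records its code origin and exit kind, and the next compilation checks that record before making the same bet again.

```javascript
// Records which (code origin, exit kind) pairs have caused OSR exits.
class ExitProfile {
    constructor() {
        this.sites = new Set();
    }

    static key(codeOrigin, exitKind) {
        return codeOrigin + ":" + exitKind;
    }

    // Called by the exit machinery when a speculation fails.
    recordExit(codeOrigin, exitKind) {
        this.sites.add(ExitProfile.key(codeOrigin, exitKind));
    }

    hasExitSite(codeOrigin, exitKind) {
        return this.sites.has(ExitProfile.key(codeOrigin, exitKind));
    }
}

// Consulted on the next compile: do not repeat a speculation that already failed here.
function canOptimizeStringObjectAccess(profile, codeOrigin) {
    return !profile.hasExitSite(codeOrigin, "NotStringObject");
}
```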




Source

function foo(a, b)
{
    return a + b;
}

Bytecode

[   0] enter
[   1] get_scope    loc3
[   3] mov          loc4, loc3
[   6] check_traps
[   7] add          loc6, arg1, arg2
[  12] ret          loc6

DFG IR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7)
24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7)
25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7)
26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)
28: Return(Untyped:@25, W:SideState, Exits, bc#12)


DFG             FTL

Fast JIT        Powerful JIT

DFG IR          DFG IR
                DFG SSA IR
                B3 IR
                Assembly IR


DFG Goal

Remove lots of type checks quickly.

DFG Goals

- Speculation
- Static Analysis
- Fast Compilation


DFG IR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7)
24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7)
25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7)
26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)
28: Return(Untyped:@25, W:SideState, Exits, bc#12)

[Slide annotations on this listing: "profiling", "speculation"]


OSR

OSR flattens control flow

OSR is hard

int foo(int* ptr)
{
    int w, x, y, z;

    w = ... // lots of stuff
    x = is_ok(ptr) ? *ptr : slow_path(ptr);
    y = ... // lots of stuff
    z = is_ok(ptr) ? *ptr : slow_path(ptr);

    return w + x + y + z;
}

int foo(int* ptr)
{
    int w, x, y, z;

    w = ... // lots of stuff

    if (!is_ok(ptr))
        return foo_base1(ptr, w);

    x = *ptr;
    y = ... // lots of stuff
    z = *ptr;

    return w + x + y + z;
}


- Must know where to exit.
- Must know what is live-at-exit.

[  42] add loc7, loc4, loc8
live after: loc3, loc4, loc7

loc4 -> const(42)
loc8 -> %rdx

[Diagram: a stack/register shuffle between two frames, one whose frame layout matches bytecode and one whose frame layout is selected by a complex process]

How? 


Leverage Bytecode SSA Conversion 


[  42] add loc7, loc4, loc8
       live after: loc3, loc4, loc7

case op_add: {
    VirtualRegister result = instruction->result();
    VirtualRegister left = instruction->left();
    VirtualRegister right = instruction->right();

    stackMap[result] = createAdd(
        stackMap[left], stackMap[right]);
    break;
}

stackMap before bc#42: JSConstant(42)
stackMap after bc#42: JSConstant(42)


[  13] mov loc4, 42
[  29] get_by_id loc8, ..
[  42] add loc7, loc4, loc8

case op_add: {
    VirtualRegister result = instruction->result();
    VirtualRegister left = instruction->left();
    VirtualRegister right = instruction->right();

    Map<VirtualRegister, Value*> environment;
    for (VirtualRegister reg : liveNow())
        environment[reg] = stackMap[reg];

    stackMap[result] = createAdd(
        stackMap[left], stackMap[right],
        environment);

    break;
}
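The exit-environment idea above can be sketched outside of JSC. The following is a minimal JavaScript model, not JSC's parser: `parseBlock`, the instruction objects, and the `liveNow` field are hypothetical names chosen for illustration. It shows each exiting node capturing a full snapshot of the live registers, which is where the O(live variables) cost per instruction comes from.

```javascript
// Hypothetical model: each bytecode instruction is a plain object, and
// `liveNow` lists the virtual registers that an exit at this point needs.
function parseBlock(bytecode) {
    const stackMap = new Map(); // virtual register -> SSA node
    const nodes = [];
    for (const inst of bytecode) {
        let node;
        if (inst.op === "mov") {
            node = { op: "Constant", value: inst.value };
        } else if (inst.op === "add") {
            // Snapshot every live register for this one instruction.
            const environment = new Map();
            for (const reg of inst.liveNow)
                environment.set(reg, stackMap.get(reg));
            node = {
                op: "Add",
                left: stackMap.get(inst.left),
                right: stackMap.get(inst.right),
                exitIndex: inst.index,
                environment,
            };
        } else {
            // Anything else is treated as an opaque value producer.
            node = { op: "Opaque", name: inst.op };
        }
        nodes.push(node);
        stackMap.set(inst.dst, node);
    }
    return nodes;
}

const bytecode = [
    { index: 13, op: "mov", dst: "loc4", value: 42 },
    { index: 29, op: "get_by_id", dst: "loc8" },
    { index: 42, op: "add", dst: "loc7", left: "loc4", right: "loc8",
      liveNow: ["loc4", "loc8"] },
];
const nodes = parseBlock(bytecode);
const addNode = nodes[2];
```

Here `addNode.environment` maps `loc4` to the constant node and `loc8` to the opaque node, so an exit at bytecode index 42 would know how to rebuild both registers.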


Exit Environment

- The obvious solution.
- Super widespread.
- But it's awful for JavaScript!


Exit Frequency | Environments Work?

(exits are rare)
Yes, they work great!
The O(live variables) cost is incurred seldom, so it's not a big deal.

(exits at nearly every instruction)
Not really.
O(live variables) per instruction is a lot of:
- data flow edges
- memory


Observation: 
environments hardly change between 
instructions. 


Use delta encoding! 


Use imperative delta encoding! 
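One way to picture imperative delta encoding is as a replay: instead of attaching a full environment to every exiting node, the IR carries small "this register now holds that value" hints, and the environment at any exit is rebuilt by replaying the hints that precede it. The sketch below is a simplified JavaScript model with hypothetical names (`exitEnvironmentAt`, the node objects); it borrows the `MovHint` name from the slides but is not JSC's implementation.

```javascript
// Hypothetical IR: a flat list of nodes. "MovHint" nodes are the deltas.
const block = [
    { id: 23, op: "GetLocal", local: "arg1" },
    { id: 24, op: "GetLocal", local: "arg2" },
    { id: 25, op: "ArithAdd", children: [23, 24], exits: true },
    { id: 26, op: "MovHint", child: 25, target: "loc6" },
    { id: 28, op: "Return", child: 25, exits: true },
];

// Rebuild the exit environment in effect just before `exitNodeId` by
// replaying the deltas that precede it, starting from the environment
// at the head of the block.
function exitEnvironmentAt(block, exitNodeId, atHead) {
    const env = new Map(atHead);
    for (const node of block) {
        if (node.id === exitNodeId)
            return env;
        if (node.op === "MovHint")
            env.set(node.target, "@" + node.child);
    }
    throw new Error("no such node: " + exitNodeId);
}

const atHead = [["arg1", "stack"], ["arg2", "stack"]];
const envAtAdd = exitEnvironmentAt(block, 25, atHead);
const envAtReturn = exitEnvironmentAt(block, 28, atHead);
```

The per-node cost is now one hint per bytecode assignment rather than one edge per live variable; the full environment is only computed when it is actually needed.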


DFG IR 


23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 
24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 
25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 


28: Return(Untyped:@25, W:SideState, Exits, bc#12) 


[   7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7)
24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7)
25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7)
26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)


DFG SSA state | DFG Exit state

loc6 := @25

[  42] add
    exit OK
    exit invalid

[ 666] wat
    exit OK
    exit invalid
    ... more effects ...
    OSR exit state update
[ 683] bar

Watchpoints 
= 
InvalidationPoint 


function foo() {
    return Math.pow(2, 3);
}

Math.pow = "hahaha";


function foo() {
    bar();
    return Math.pow(2, 3);
}

function bar() {
    if (p)
        Math = 0;
}


Invalidation Idea 


Walk the stack. 
Repoint return pointers to OSR exit. 
Widespread idea. 


Doesn't work with DFG IR. 


... ExitOK ... ExitOK ...

Nowhere to exit to!


InvalidationPoint 


- Deferred invalidation in case an in-progress effect has nowhere to exit.
- Emits no code.


function foo() {
    bar();
    return Math.pow(2, 3);
}

Invalidated Optimizing Tier Version

function bar() {
    if (p)
        Math = 0;
}
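The interaction between a watchpoint and an InvalidationPoint can be imitated in plain JavaScript. This is a rough model under stated assumptions: `WatchpointSet`, `setGlobal`, `optimizedFoo`, and `exitAfterBar` are hypothetical names, and the explicit `invalidated` test stands in for what JSC does without emitting any check (per the slide, an InvalidationPoint emits no code).

```javascript
// Hypothetical model; none of these names are JSC APIs.
class WatchpointSet {
    constructor() { this.dependents = []; }
    add(code) { this.dependents.push(code); }
    fire() { for (const code of this.dependents) code.invalidated = true; }
}

const mathWatchpoint = new WatchpointSet();
const globals = { Math: { pow: (base, exp) => base ** exp }, p: false };

// Every write to the watched global fires the watchpoint.
function setGlobal(name, value) {
    globals[name] = value;
    if (name === "Math")
        mathWatchpoint.fire();
}

function bar() {
    if (globals.p)
        setGlobal("Math", 0);
}

// What the lower tier would do once bar() has returned.
function exitAfterBar() {
    return globals.Math.pow(2, 3);
}

const optimizedFoo = {
    invalidated: false,
    run() {
        bar();
        // InvalidationPoint: the call has finished, so there is a valid
        // place to exit to. Modeled here as an explicit flag test.
        if (this.invalidated)
            return exitAfterBar();
        return 8; // Math.pow(2, 3), folded under the watchpoint's assumption
    },
};
mathWatchpoint.add(optimizedFoo);
```

While `globals.p` is false, `optimizedFoo.run()` returns the folded constant. Once `bar()` overwrites `Math`, the invalidation is deferred until the call returns, and execution continues in the generic path, which then fails exactly as unoptimized code would.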


DFG Goals 


- Speculation
- Static Analysis
- Fast Compilation


Remove type checks 


[Figure: control-flow graph whose blocks contain repeated Check(Int32:@foo) nodes.]


Abstract Interpreter 


"Global" (whole compilation unit)
Flow sensitive

Tracks:
- variable type
- object structure
- indexing type
- constants
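A flow-sensitive analysis of this kind can be illustrated with a very small forward dataflow pass. The sketch below is an assumption-laden toy, not the DFG abstract interpreter: it tracks only "proven Int32" facts, assumes no node ever invalidates a fact, expects blocks in reverse postorder, and uses hypothetical names (`removeRedundantChecks`, `check`).

```javascript
// Hypothetical helper: a type check on a named value.
const check = value => ({ op: "Check", type: "Int32", value });

// Blocks must be listed in reverse postorder; the first block is the entry.
function removeRedundantChecks(blocks) {
    // Block name -> Set of values proven Int32 at block head.
    // A missing entry means "not reached yet".
    const atHead = new Map([[blocks[0].name, new Set()]]);
    let changed = true;
    while (changed) {
        changed = false;
        for (const block of blocks) {
            const head = atHead.get(block.name);
            if (!head)
                continue;
            const state = new Set(head);
            for (const node of block.nodes)
                if (node.op === "Check")
                    state.add(node.value);
            for (const succ of block.successors) {
                const old = atHead.get(succ);
                // Merge by intersection: a fact holds only if every path proves it.
                const merged = old
                    ? new Set([...old].filter(v => state.has(v)))
                    : new Set(state);
                if (!old || merged.size !== old.size) {
                    atHead.set(succ, merged);
                    changed = true;
                }
            }
        }
    }
    // Drop every check whose fact is already proven where it executes.
    for (const block of blocks) {
        const state = new Set(atHead.get(block.name) || []);
        block.nodes = block.nodes.filter(node => {
            if (node.op !== "Check")
                return true;
            if (state.has(node.value))
                return false;
            state.add(node.value);
            return true;
        });
    }
    return blocks;
}

const cfg = removeRedundantChecks([
    { name: "entry", nodes: [check("@foo")], successors: ["left", "right"] },
    { name: "left", nodes: [check("@foo"), check("@bar")], successors: ["merge"] },
    { name: "right", nodes: [], successors: ["merge"] },
    { name: "merge", nodes: [check("@foo"), check("@bar")], successors: [] },
]);
```

In this diamond, the checks on `@foo` in `left` and `merge` are removed because `entry` already proved it on every path, while the check on `@bar` in `merge` stays because only the `left` path proved it.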


DFG Goals 


- Speculation
- Static Analysis
- Fast Compilation


Fast Compile 


- Emphasis on block-locality.
- Template code generation.

- Primary block-local data flow graph.
- Secondary global data flow graph.


DFG Template Codegen

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7)
24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7)
25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7)
26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)
28: Return(Untyped:@25, W:SideState, Exits, bc#12)

[Figure: the machine code emitted for each node from its template; the extraction preserves only fragments, including an add instruction and a jump to an exit label.]
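Template code generation can be pictured as a table lookup: each node type owns a fixed instruction sequence, and the generator simply concatenates them in order. The JavaScript sketch below is illustrative only. The `templates` table, the register names, and the stack offsets are made up, and the assumption that a `MovHint` contributes no instructions is a simplification for this model.

```javascript
// Hypothetical templates: one fixed instruction sequence per node type.
const templates = {
    GetLocal: node => [`mov ${node.slot}, ${node.reg}`],
    // The add is followed by an overflow branch to the OSR exit path.
    ArithAdd: node => [`add ${node.src}, ${node.dst}`, "jo Lexit"],
    // Assumed here to update exit bookkeeping only.
    MovHint: () => [],
    Return: () => ["ret"],
};

// Walk the nodes once, in order, and paste in each template.
function generate(nodes) {
    return nodes.flatMap(node => templates[node.op](node));
}

const machineCode = generate([
    { op: "GetLocal", slot: "48(%rbp)", reg: "%eax" },
    { op: "GetLocal", slot: "56(%rbp)", reg: "%esi" },
    { op: "ArithAdd", src: "%esi", dst: "%eax" },
    { op: "MovHint" },
    { op: "Return" },
]);
```

Because there is no instruction selection across nodes, compile time stays roughly linear in the number of nodes, which is the trade-off the "Fast Compile" goal points at.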
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DFG optimization pipeline 


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TI 
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JetStream 2 Score 


on my computer one day

Configuration             Relative score
LLInt
LLInt+Baseline            >2x LLInt
LLInt+Baseline+DFG        >2x Baseline
LLInt+Baseline+DFG+FTL    ~1.1x DFG

[Bar chart; x-axis: Score (higher is better), 0 to 150]


DFG         FTL
Fast JIT    Powerful JIT

DFG SSA IR
B3 IR
Assembly IR


FTL Goal 


All the optimizations. 


FTL IRs

IR         Level          Style         Example
Bytecode   High Level     Load/Store    bitor dst, left, right
DFG        Medium Level   Exotic SSA    dst: BitOr(Int32:@left, Int32:@right, ...)
B3         Low Level      Normal SSA    Int32 @dst = BitOr(@left, @right)
Air        Architectural  CISC          Or32 %src, %dest


FTL optimization pipeline

DFG IR

B3 IR
    Double-to-Float
    Simplify (folding, CFG, etc)
    LICM
    Global CSE
    Switch Inference
    Tail Duplication
    Path Constants
    Macro Lowering
    Legalization
    Constant Motion
    Lower to Air (isel)

Air
    Simplify CFG
    Macro Lowering
    DCE
    Graph Coloring Reg Alloc
    Spill CSE
    Graph Coloring Stack Alloc
    Report Used Registers
    Fix Partial Register Stalls
    Lower Multiple Entrypoints
    Select Block Order
    Emit Machine Code


Source 


function foo(a, b, c)
{
    return a + b + c;
}


Bytecode 


[   0] enter
[   1] get_scope loc3
[   3] mov loc4, loc3
[   6] check_traps
[   7] add loc6, arg1, arg2
[  12] add loc6, loc6, arg3
[  17] ret loc6


DFG IR

24: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7)
25: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7)
26: ArithAdd(Int32:@24, Int32:@25, CheckOverflow, Exits, bc#7)
27: MovHint(Untyped:@26, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)
29: GetLocal(Untyped:@3, arg3(D<Int32>/FlushedInt32), R:Stack(8), bc#12)
30: ArithAdd(Int32:@26, Int32:@29, CheckOverflow, Exits, bc#12)
31: MovHint(Untyped:@30, loc6, W:SideState, ClobbersExit, bc#12, ExitInvalid)
33: Return(Untyped:@3, W:SideState, Exits, bc#17)


B3 IR 


Int32 @42 = Trunc(@32, DFG:@26)
Int32 @43 = Trunc(@27, DFG:@26)
Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0,
                     earlyClobbered = [], lateClobbered = [], usedRegisters = [],
                     ExitsSideways|Reads:Top, DFG:@26)
Int32 @45 = Trunc(@22, DFG:@30)
Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70,
                     earlyClobbered = [], lateClobbered = [], usedRegisters = [],
                     ExitsSideways|Reads:Top, DFG:@30)
Int64 @47 = ZExt32(@46, DFG:@32)
Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32)
Void @49 = Return(@48, Terminal, DFG:@32)

Corresponding DFG nodes:

26: ArithAdd(Int32:@24, Int32:@25, CheckOverflow, Exits, bc#7)
27: MovHint(Untyped:@26, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)
30: ArithAdd(Int32:@26, Int32:@29, CheckOverflow, Exits, bc#12)


B3 IR

CheckAdd(@left, @right, @arg0, @arg1, @arg2, ..,
         generator = 0x...)

JSC::FTL::OSRExitDescriptor

Air backend

Patch &BranchAdd32 Overflow, %left, %right, %dst,
      %arg0, %arg1, %arg2, ..,
      generator = 0x...

Patch &BranchAdd32 Overflow, %left, %right, %dst,
      %rcx, %r11, %rax, ...,
      generator = 0x...
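The division of labor on these slides (the check carries extra arguments, the descriptor remembers what each argument means, and the backend later decides where each argument lives) can be mimicked with a small JavaScript model. Everything here is hypothetical: `materializeExit`, the descriptor shape, and the register contents are invented for illustration and do not reflect the layout of `JSC::FTL::OSRExitDescriptor`.

```javascript
// Hypothetical exit descriptor: which bytecode local is recovered from
// which extra argument of the check, and where to resume.
const descriptor = {
    bytecodeIndex: 12,
    recoveries: [{ local: "loc6", argument: 0 }],
};

// Called on the exit path, once the backend has chosen a location for each
// extra argument. `machineState` stands in for the live register file.
function materializeExit(descriptor, argumentLocations, machineState) {
    const frame = {};
    for (const { local, argument } of descriptor.recoveries)
        frame[local] = machineState[argumentLocations[argument]];
    return { resumeAt: descriptor.bytecodeIndex, frame };
}

// Suppose the register allocator placed argument 0 in %rcx.
const exitState = materializeExit(
    descriptor, ["%rcx"], { "%rcx": 7, "%r11": 1, "%rax": 99 });
```

The point of the split is that the optimizer can treat the extra arguments as ordinary uses while it transforms the code; only the final locations are consulted when an exit is actually taken.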


[Figure: B3 IR values (Trunc, Trunc, CheckAdd) and the machine code generated for them.]


inline void x86_cpuid()
{
    intptr_t a = 0, b, c, d;
    asm volatile(
        "cpuid"
        : "+a"(a), "=b"(b), "=c"(c), "=d"(d)
        :
        : "memory");
}


if (MacroAssemblerARM64::
    supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) {
    PatchpointValue* patchpoint = m_out.patchpoint(Int32);
    patchpoint->appendSomeRegister(doubleValue);
    patchpoint->setGenerator(
        [=] (CCallHelpers& jit,
            const StackmapGenerationParams& params) {
            jit.convertDoubleToInt32UsingJavaScriptSemantics(
                params[1].fpr(), params[0].gpr());
        });
    patchpoint->effects = Effects::none();
    return patchpoint;
}
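The shape of this API, an opaque value whose code comes from a callback that runs after locations are known, can be shown with a toy backend. The JavaScript below is a sketch under assumptions: `patchpoint`, `emit`, the location strings, and the `convertDoubleToInt32` pseudo-mnemonic are all invented, and only the convention that `params[0]` is the result and later entries are the inputs mirrors the snippet above.

```javascript
// Hypothetical mini-backend: a patchpoint is an opaque IR value whose code
// is produced by a callback once locations are known.
function patchpoint(inputs, generator) {
    return { op: "Patchpoint", inputs, generator };
}

// `locationOf` plays the role of the register allocator's final answer.
function emit(values, locationOf) {
    const lines = [];
    for (const value of values) {
        // params[0] is the result location; the inputs follow in order.
        const params = [locationOf(value), ...value.inputs.map(locationOf)];
        value.generator(lines, params);
    }
    return lines;
}

const doubleValue = { op: "Load", type: "Double", inputs: [] };
const toInt32 = patchpoint([doubleValue], (lines, params) => {
    lines.push(`convertDoubleToInt32 ${params[1]}, ${params[0]}`);
});

const locations = new Map([[doubleValue, "%fpr0"], [toInt32, "%gpr0"]]);
const code = emit([toInt32], value => locations.get(value));
```

The generator never picks registers itself; it only formats whatever the allocator chose, which is what lets the surrounding optimizer treat the patchpoint like any other value.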


Patchpoint Use Cases

Polymorphic inline caches
Calls with interesting calling conventions
Lazy slow paths
Interesting instructions


DFG         FTL
Fast JIT    Powerful JIT


DFG SSA IR 


B3 IR 


Assembly IR 


Speculation in JSC 


